CKS: Allow affinity group selection during cluster creation#12386
CKS: Allow affinity group selection during cluster creation#12386Damans227 wants to merge 41 commits intoapache:mainfrom
Conversation
…to CreateKubernetesClusterCmd
…ndling and enhance node type validation tests
…ubernetesClusterVO
…bernetesClusterManagerImpl
…lusterVO to support multiple IDs
…erviceHelper and related classes
…ting kubernetes_cluster
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16294 |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #12386 +/- ##
============================================
+ Coverage 17.76% 17.95% +0.18%
- Complexity 15859 16220 +361
============================================
Files 5923 5942 +19
Lines 530470 533525 +3055
Branches 64823 65281 +458
============================================
+ Hits 94243 95796 +1553
- Misses 425682 426982 +1300
- Partials 10545 10747 +202
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@blueorangutan package |
|
@nvazquez a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16800 |
|
@blueorangutan test |
|
@nvazquez a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
DaanHoogland
left a comment
There was a problem hiding this comment.
clgtm (afaics) , will need some 3rd person testing, but looks really promissing.
|
[SF] Trillian test result (tid-15442)
|
| @Override | ||
| public List<Long> listAffinityGroupIdsByClusterIdAndNodeType(long clusterId, String nodeType) { | ||
| List<KubernetesClusterAffinityGroupMapVO> maps = listByClusterIdAndNodeType(clusterId, nodeType); | ||
| return maps.stream().map(KubernetesClusterAffinityGroupMapVO::getAffinityGroupId).collect(Collectors.toList()); |
There was a problem hiding this comment.
I think a null check should be needed at this point, to cover cases in which the affinity group is not passed for the node type
| } | ||
|
|
||
| protected List<Long> getAffinityGroupIdsForNodeType(KubernetesClusterNodeType nodeType) { | ||
| return new ArrayList<>(kubernetesClusterAffinityGroupMapDao.listAffinityGroupIdsByClusterIdAndNodeType( |
There was a problem hiding this comment.
Here as well, can we add checks for null or empty list?
There was a problem hiding this comment.
Minor: can use CollectionUtils.isEmpty(affinityGroupIds) here
| Long nodeHostId = node.getHostId(); | ||
| String nodeHostName = getHostName(nodeHostId); | ||
|
|
||
| if ("host anti-affinity".equalsIgnoreCase(affinityGroupType)) { |
There was a problem hiding this comment.
What about adding the following checks to not rely on a string comparisson?
- Inject
AffinityGroupServiceon the class level - Add the following method on
AffinityGroupServiceinterface:protected Map<String, AffinityGroupProcessor> getAffinityTypeToProcessorMap() - Mark this method as public on
AffinityGroupServiceImpl(definition already exists) - Obtain the
AffinityGroupProcessorfor the affinity group type (adding null checks) and replace the string comparisson toprocessor instanceof HostAntiAffinityProcessorandprocessor isinstanceof HostAffinityProcessor
What do you think?
| } | ||
|
|
||
| protected void validateNewNodesAntiAffinity(List<Long> nodeIds, AffinityGroupVO affinityGroup, KubernetesCluster cluster) { | ||
| if (!"host anti-affinity".equalsIgnoreCase(affinityGroup.getType())) { |
…tesClusterResourceModifierActionWorker
|
@Damans227 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress. |
1 similar comment
|
@Damans227 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress. |
|
@blueorangutan package |
|
@Damans227 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16863 |
There was a problem hiding this comment.
Pull request overview
This PR adds comprehensive support for affinity group selection during CKS (CloudStack Kubernetes Service) cluster creation. The feature allows users to specify different affinity groups for each node type (CONTROL, WORKER, ETCD), enabling fine-grained control over VM placement for high availability and performance optimization.
Changes:
- New database table
kubernetes_cluster_affinity_group_mapfor storing per-node-type affinity group associations - New API parameter
nodeaffinitygroupsin thecreateKubernetesClusterAPI supporting multiple affinity groups per node type - UI enhancements to allow affinity group selection for control, worker, and ETCD nodes in advanced mode
- Backend logic to merge user-selected affinity groups with ExplicitDedication groups during VM provisioning
- Comprehensive integration and unit tests covering various affinity group scenarios
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
engine/schema/src/main/resources/META-INF/db/schema-42210to42300.sql |
Creates new table for affinity group mappings with proper foreign key constraints |
api/src/main/java/org/apache/cloudstack/api/ApiConstants.java |
Adds API constants for affinity group IDs and names response fields |
api/src/main/java/com/cloud/vm/VmDetailConstants.java |
Adds AFFINITY_GROUP constant for parameter mapping |
api/src/main/java/com/cloud/kubernetes/cluster/KubernetesServiceHelper.java |
Extends interface with getAffinityGroupNodeTypeMap method |
plugins/.../KubernetesClusterAffinityGroupMapVO.java |
New entity class for affinity group mappings |
plugins/.../dao/KubernetesClusterAffinityGroupMapDao.java |
DAO interface for affinity group mapping operations |
plugins/.../dao/KubernetesClusterAffinityGroupMapDaoImpl.java |
DAO implementation with search builders |
plugins/.../KubernetesServiceHelperImpl.java |
Implements affinity group validation and ID resolution logic |
plugins/.../KubernetesClusterManagerImpl.java |
Adds affinity group persistence, response building, and node validation logic |
plugins/.../actionworkers/KubernetesClusterActionWorker.java |
Adds methods to retrieve and merge affinity groups for node types |
plugins/.../actionworkers/KubernetesClusterStartWorker.java |
Updates VM creation to use merged affinity group lists |
plugins/.../actionworkers/KubernetesClusterResourceModifierActionWorker.java |
Updates worker node creation with affinity groups |
plugins/.../actionworkers/KubernetesClusterDestroyWorker.java |
Adds cleanup for affinity group mappings on cluster deletion |
plugins/.../CreateKubernetesClusterCmd.java |
Adds affinityGroupNodeTypeMap parameter and helper method |
plugins/.../ScaleKubernetesClusterCmd.java |
Renames variable from kubernetesClusterHelper to kubernetesServiceHelper |
plugins/.../KubernetesClusterResponse.java |
Adds response fields for affinity group IDs and names |
plugins/.../spring-kubernetes-service-context.xml |
Registers new DAO bean |
ui/src/views/compute/CreateKubernetesCluster.vue |
Adds UI controls for affinity group selection with account validation |
ui/src/config/section/compute.js |
Adds affinity group fields to cluster details view |
ui/public/locales/en.json |
Adds localization strings for affinity group labels |
test/integration/component/test_kubernetes_cluster_affinity_groups.py |
Comprehensive integration tests for affinity group functionality |
plugins/.../test/.../KubernetesClusterActionWorkerTest.java |
Unit tests for affinity group retrieval and merging |
plugins/.../test/.../KubernetesServiceHelperImplTest.java |
Unit tests for affinity group validation and mapping |
plugins/.../test/.../KubernetesClusterManagerImplTest.java |
Unit tests for affinity group response building and validation |
plugins/.../test/.../KubernetesClusterAffinityGroupMapVOTest.java |
Unit tests for the VO class |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Missing localization string for error message. The code references 'message.error.affinity.groups.different.accounts' but this key is not defined in en.json. Add the following entry to ui/public/locales/en.json:
"message.error.affinity.groups.different.accounts": "Affinity groups from different accounts cannot be used together"
| message: this.$t('message.error.affinity.groups.different.accounts') | |
| message: 'Affinity groups from different accounts cannot be used together' |
| self.assertEqual( | ||
| cluster.controlnodeaffinitygroupid, | ||
| control_aff_grp.id, | ||
| "Control node affinity group ID mismatch. Expected: {}, Got: {}".format( | ||
| control_aff_grp.id, cluster.controlnodeaffinitygroupid) | ||
| ) | ||
| self.assertEqual( | ||
| cluster.controlnodeaffinitygroupname, | ||
| control_aff_grp.name, | ||
| "Control node affinity group name mismatch. Expected: {}, Got: {}".format( | ||
| control_aff_grp.name, cluster.controlnodeaffinitygroupname) | ||
| ) | ||
| else: | ||
| self.assertTrue( | ||
| not hasattr(cluster, 'controlnodeaffinitygroupid') or cluster.controlnodeaffinitygroupid is None, | ||
| "Control node affinity group should be None" | ||
| ) | ||
|
|
||
| if worker_aff_grp is not None: | ||
| self.assertEqual( | ||
| cluster.workernodeaffinitygroupid, | ||
| worker_aff_grp.id, | ||
| "Worker node affinity group ID mismatch. Expected: {}, Got: {}".format( | ||
| worker_aff_grp.id, cluster.workernodeaffinitygroupid) | ||
| ) | ||
| self.assertEqual( | ||
| cluster.workernodeaffinitygroupname, | ||
| worker_aff_grp.name, | ||
| "Worker node affinity group name mismatch. Expected: {}, Got: {}".format( | ||
| worker_aff_grp.name, cluster.workernodeaffinitygroupname) | ||
| ) | ||
| else: | ||
| self.assertTrue( | ||
| not hasattr(cluster, 'workernodeaffinitygroupid') or cluster.workernodeaffinitygroupid is None, | ||
| "Worker node affinity group should be None" | ||
| ) | ||
|
|
||
| if etcd_aff_grp is not None: | ||
| self.assertEqual( | ||
| cluster.etcdnodeaffinitygroupid, | ||
| etcd_aff_grp.id, | ||
| "ETCD node affinity group ID mismatch. Expected: {}, Got: {}".format( | ||
| etcd_aff_grp.id, cluster.etcdnodeaffinitygroupid) | ||
| ) | ||
| self.assertEqual( | ||
| cluster.etcdnodeaffinitygroupname, | ||
| etcd_aff_grp.name, | ||
| "ETCD node affinity group name mismatch. Expected: {}, Got: {}".format( | ||
| etcd_aff_grp.name, cluster.etcdnodeaffinitygroupname) | ||
| ) | ||
| else: | ||
| self.assertTrue( | ||
| not hasattr(cluster, 'etcdnodeaffinitygroupid') or cluster.etcdnodeaffinitygroupid is None, | ||
| "ETCD node affinity group should be None" |
There was a problem hiding this comment.
Test field name mismatch: The test is checking for singular fields like 'controlnodeaffinitygroupid' and 'controlnodeaffinitygroupname', but the API response only defines plural fields 'controlaffinitygroupids' and 'controlaffinitygroupnames' (which return comma-separated values). The test assertions at lines 400, 406, 420, 426, 438, and 444 will fail because these singular attributes don't exist in the KubernetesClusterResponse class.
Update the test to use the correct plural field names and handle CSV parsing. For example:
- cluster.controlaffinitygroupids instead of cluster.controlnodeaffinitygroupid
- Parse the CSV string and verify it contains the expected affinity group ID
| self.assertEqual( | |
| cluster.controlnodeaffinitygroupid, | |
| control_aff_grp.id, | |
| "Control node affinity group ID mismatch. Expected: {}, Got: {}".format( | |
| control_aff_grp.id, cluster.controlnodeaffinitygroupid) | |
| ) | |
| self.assertEqual( | |
| cluster.controlnodeaffinitygroupname, | |
| control_aff_grp.name, | |
| "Control node affinity group name mismatch. Expected: {}, Got: {}".format( | |
| control_aff_grp.name, cluster.controlnodeaffinitygroupname) | |
| ) | |
| else: | |
| self.assertTrue( | |
| not hasattr(cluster, 'controlnodeaffinitygroupid') or cluster.controlnodeaffinitygroupid is None, | |
| "Control node affinity group should be None" | |
| ) | |
| if worker_aff_grp is not None: | |
| self.assertEqual( | |
| cluster.workernodeaffinitygroupid, | |
| worker_aff_grp.id, | |
| "Worker node affinity group ID mismatch. Expected: {}, Got: {}".format( | |
| worker_aff_grp.id, cluster.workernodeaffinitygroupid) | |
| ) | |
| self.assertEqual( | |
| cluster.workernodeaffinitygroupname, | |
| worker_aff_grp.name, | |
| "Worker node affinity group name mismatch. Expected: {}, Got: {}".format( | |
| worker_aff_grp.name, cluster.workernodeaffinitygroupname) | |
| ) | |
| else: | |
| self.assertTrue( | |
| not hasattr(cluster, 'workernodeaffinitygroupid') or cluster.workernodeaffinitygroupid is None, | |
| "Worker node affinity group should be None" | |
| ) | |
| if etcd_aff_grp is not None: | |
| self.assertEqual( | |
| cluster.etcdnodeaffinitygroupid, | |
| etcd_aff_grp.id, | |
| "ETCD node affinity group ID mismatch. Expected: {}, Got: {}".format( | |
| etcd_aff_grp.id, cluster.etcdnodeaffinitygroupid) | |
| ) | |
| self.assertEqual( | |
| cluster.etcdnodeaffinitygroupname, | |
| etcd_aff_grp.name, | |
| "ETCD node affinity group name mismatch. Expected: {}, Got: {}".format( | |
| etcd_aff_grp.name, cluster.etcdnodeaffinitygroupname) | |
| ) | |
| else: | |
| self.assertTrue( | |
| not hasattr(cluster, 'etcdnodeaffinitygroupid') or cluster.etcdnodeaffinitygroupid is None, | |
| "ETCD node affinity group should be None" | |
| control_ids_csv = getattr(cluster, 'controlaffinitygroupids', None) | |
| control_names_csv = getattr(cluster, 'controlaffinitygroupnames', None) | |
| self.assertIsNotNone( | |
| control_ids_csv, | |
| "Control affinity group IDs should be present when a control affinity group is specified" | |
| ) | |
| self.assertIsNotNone( | |
| control_names_csv, | |
| "Control affinity group names should be present when a control affinity group is specified" | |
| ) | |
| control_ids = [v.strip() for v in str(control_ids_csv).split(',') if v.strip()] | |
| control_names = [v.strip() for v in str(control_names_csv).split(',') if v.strip()] | |
| self.assertIn( | |
| str(control_aff_grp.id), | |
| control_ids, | |
| "Control node affinity group ID mismatch. Expected to find ID {} in {}".format( | |
| control_aff_grp.id, control_ids_csv) | |
| ) | |
| self.assertIn( | |
| control_aff_grp.name, | |
| control_names, | |
| "Control node affinity group name mismatch. Expected to find name '{}' in '{}'".format( | |
| control_aff_grp.name, control_names_csv) | |
| ) | |
| else: | |
| self.assertTrue( | |
| not hasattr(cluster, 'controlaffinitygroupids') or not getattr(cluster, 'controlaffinitygroupids'), | |
| "Control node affinity group should be None or empty" | |
| ) | |
| if worker_aff_grp is not None: | |
| worker_ids_csv = getattr(cluster, 'workeraffinitygroupids', None) | |
| worker_names_csv = getattr(cluster, 'workeraffinitygroupnames', None) | |
| self.assertIsNotNone( | |
| worker_ids_csv, | |
| "Worker affinity group IDs should be present when a worker affinity group is specified" | |
| ) | |
| self.assertIsNotNone( | |
| worker_names_csv, | |
| "Worker affinity group names should be present when a worker affinity group is specified" | |
| ) | |
| worker_ids = [v.strip() for v in str(worker_ids_csv).split(',') if v.strip()] | |
| worker_names = [v.strip() for v in str(worker_names_csv).split(',') if v.strip()] | |
| self.assertIn( | |
| str(worker_aff_grp.id), | |
| worker_ids, | |
| "Worker node affinity group ID mismatch. Expected to find ID {} in {}".format( | |
| worker_aff_grp.id, worker_ids_csv) | |
| ) | |
| self.assertIn( | |
| worker_aff_grp.name, | |
| worker_names, | |
| "Worker node affinity group name mismatch. Expected to find name '{}' in '{}'".format( | |
| worker_aff_grp.name, worker_names_csv) | |
| ) | |
| else: | |
| self.assertTrue( | |
| not hasattr(cluster, 'workeraffinitygroupids') or not getattr(cluster, 'workeraffinitygroupids'), | |
| "Worker node affinity group should be None or empty" | |
| ) | |
| if etcd_aff_grp is not None: | |
| etcd_ids_csv = getattr(cluster, 'etcdaffinitygroupids', None) | |
| etcd_names_csv = getattr(cluster, 'etcdaffinitygroupnames', None) | |
| self.assertIsNotNone( | |
| etcd_ids_csv, | |
| "ETCD affinity group IDs should be present when an ETCD affinity group is specified" | |
| ) | |
| self.assertIsNotNone( | |
| etcd_names_csv, | |
| "ETCD affinity group names should be present when an ETCD affinity group is specified" | |
| ) | |
| etcd_ids = [v.strip() for v in str(etcd_ids_csv).split(',') if v.strip()] | |
| etcd_names = [v.strip() for v in str(etcd_names_csv).split(',') if v.strip()] | |
| self.assertIn( | |
| str(etcd_aff_grp.id), | |
| etcd_ids, | |
| "ETCD node affinity group ID mismatch. Expected to find ID {} in {}".format( | |
| etcd_aff_grp.id, etcd_ids_csv) | |
| ) | |
| self.assertIn( | |
| etcd_aff_grp.name, | |
| etcd_names, | |
| "ETCD node affinity group name mismatch. Expected to find name '{}' in '{}'".format( | |
| etcd_aff_grp.name, etcd_names_csv) | |
| ) | |
| else: | |
| self.assertTrue( | |
| not hasattr(cluster, 'etcdaffinitygroupids') or not getattr(cluster, 'etcdaffinitygroupids'), | |
| "ETCD node affinity group should be None or empty" |
c0748bb to
f06461c
Compare
|
@blueorangutan package |
|
@Damans227 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16885 |
Description
This PR adds support for specifying affinity groups during CKS (CloudStack Kubernetes Service) cluster creation, allowing users to control VM placement for high availability.
Changes:
nodeaffinitygroupsparameter forcreateKubernetesClusterAPIDesign doc:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Allow+users+to+select+affinity+group+during+managed+CKS+cluster+creation
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
cmk based api testing:
Screen recording
Screencast.from.2026-01-13.06-47-14.mp4
Screenshots
How Has This Been Tested?
How did you try to break this feature and the system with this change?